Non-parametric Approximate Dynamic Programming via the Kernel Method
نویسندگان
چکیده
This paper presents a novel non-parametric approximate dynamic programming (ADP) algorithm that enjoys graceful approximation and sample complexity guarantees. In particular, we establish both theoretically and computationally that our proposal can serve as a viable alternative to state-of-the-art parametric ADP algorithms, freeing the designer from carefully specifying an approximation architecture. We accomplish this by developing a kernel-based mathematical program for ADP. Via a computational study on a controlled queueing network, we show that our procedure is competitive with parametric ADP approaches.
منابع مشابه
A Non-Parametric Approach to Dynamic Programming
In this paper, we consider the problem of policy evaluation for continuousstate systems. We present a non-parametric approach to policy evaluation, which uses kernel density estimation to represent the system. The true form of the value function for this model can be determined, and can be computed using Galerkin’s method. Furthermore, we also present a unified view of several well-known policy...
متن کاملOPTIMIZATION OF A PRODUCTION LOT SIZING PROBLEM WITH QUANTITY DISCOUNT
Dynamic lot sizing problem is one of the significant problem in industrial units and it has been considered by many researchers. Considering the quantity discount in purchasing cost is one of the important and practical assumptions in the field of inventory control models and it has been less focused in terms of stochastic version of dynamic lot sizing problem. In this paper, stochastic dyn...
متن کاملBoosted Bellman Residual Minimization Handling Expert Demonstrations
This paper addresses the problem of batch Reinforcement Learning with Expert Demonstrations (RLED). In RLED, the goal is to find an optimal policy of a Markov Decision Process (MDP), using a data set of fixed sampled transitions of the MDP as well as a data set of fixed expert demonstrations. This is slightly different from the batch Reinforcement Learning (RL) framework where only fixed sample...
متن کاملOn the Approximation of a Conditional Expectation
In this paper, we discuss how to approximate the conditional expectation of a random variable Y given a random variable X, i.e. E(Y|X). We propose and compare two different non parametric methodologies to approximate E(Y|X). The first approach (namely the OLP method) is based on a suitable approximation of the σ-algebra generated by X. A second procedure is based on the well known kernel non-pa...
متن کاملSemi-parametric Quantile Regression for Analysing Continuous Longitudinal Responses
Recently, quantile regression (QR) models are often applied for longitudinal data analysis. When the distribution of responses seems to be skew and asymmetric due to outliers and heavy-tails, QR models may work suitably. In this paper, a semi-parametric quantile regression model is developed for analysing continuous longitudinal responses. The error term's distribution is assumed to be Asymmetr...
متن کامل